74 research outputs found

    Dynamic Matrix Factorization with Priors on Unknown Values

    Advanced and effective collaborative filtering methods based on explicit feedback assume that unknown ratings do not follow the same model as the observed ones (not missing at random). In this work, we build on this assumption and introduce a novel dynamic matrix factorization framework that allows setting an explicit prior on unknown values. When new ratings, users, or items enter the system, we can update the factorization in time independent of the size of the data (number of users, items, and ratings). Hence, we can quickly recommend items even to very recent users. We test our methods on three large datasets, including two very sparse ones, in static and dynamic conditions. In each case, we outrank state-of-the-art matrix factorization methods that do not use a prior on unknown ratings. Comment: in the Proceedings of 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining 201

    Population control of 2s-2p transitions in hydrogen

    We consider the time evolution of the occupation probabilities for the 2s-2p transition in a hydrogen atom interacting with an external field, V(t). A two-state model and a dipole approximation are used. In the case of degenerate energy levels, an analytical solution of the time-dependent Schrödinger equation for the probability amplitudes exists. The form of the solution allows one to choose the ratio of the field amplitude to its frequency that leads to temporal trapping of electrons in specific states. The analytic solution is valid when the separation of the energy levels is small compared to the energy of the interacting radiation. Comment: 6 pages, 3 figure
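
    The degenerate two-state solution mentioned above can be sketched as follows. This is a worked illustration under stated assumptions, not the paper's own derivation: the coupling is taken to be real and harmonic, $V(t) = V_0 \cos\omega t$, the atom starts in the 2s state, and the symbols $a_{2s}$, $a_{2p}$, $A(t)$, and $V_0$ are introduced here for illustration only.

```latex
% Degenerate two-state model: both levels at the same energy, coupled by V(t).
\begin{align}
  i\hbar\,\dot a_{2s}(t) &= V(t)\,a_{2p}(t), &
  i\hbar\,\dot a_{2p}(t) &= V(t)\,a_{2s}(t), &
  a_{2s}(0) &= 1,\quad a_{2p}(0) = 0 .
\end{align}
% Define the accumulated phase A(t); the amplitudes then follow in closed form.
\begin{align}
  A(t) &= \frac{1}{\hbar}\int_0^t V(t')\,dt', &
  a_{2s}(t) &= \cos A(t), &
  a_{2p}(t) &= -i\,\sin A(t).
\end{align}
% Assumed harmonic field V(t) = V_0 \cos\omega t:
\begin{align}
  A(t) &= \frac{V_0}{\hbar\omega}\,\sin\omega t, &
  P_{2p}(t) &= |a_{2p}(t)|^2 = \sin^2\!\left(\frac{V_0}{\hbar\omega}\,\sin\omega t\right).
\end{align}
```

    Under these assumptions the occupation probabilities depend on the field only through the ratio $V_0/(\hbar\omega)$. For example, when $V_0/(\hbar\omega) \le \pi/2$ the 2p population never exceeds $\sin^2\!\big(V_0/(\hbar\omega)\big)$, so a small ratio keeps the electron mostly in its initial state.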

    Fast Differentially Private Matrix Factorization

    Differentially private collaborative filtering is a challenging task, both in terms of accuracy and speed. We present a simple algorithm that is provably differentially private, while offering good performance, using a novel connection of differential privacy to Bayesian posterior sampling via Stochastic Gradient Langevin Dynamics. Due to its simplicity, the algorithm lends itself to efficient implementation. By careful systems design and by exploiting the power-law behavior of the data to maximize CPU cache bandwidth, we are able to generate 1024-dimensional models at a rate of 8.5 million recommendations per second on a single PC.
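
    To make the sampling idea concrete, here is a minimal sketch of a Stochastic Gradient Langevin Dynamics update on a toy one-parameter model. It is not the paper's algorithm and makes no privacy claim: the model (prior theta ~ N(0, 1), data x_i ~ N(theta, 1)) and the function names `sgld_step` and `sample_gaussian_mean` are assumptions chosen for illustration.

```python
import math
import random


def sgld_step(theta, grad_log_post, step_size, rng):
    """One SGLD update: half a gradient step plus N(0, step_size) noise."""
    noise = rng.gauss(0.0, math.sqrt(step_size))
    return theta + 0.5 * step_size * grad_log_post + noise


def sample_gaussian_mean(data, steps, step_size, batch_size, seed=0):
    """Draw SGLD samples of the mean of unit-variance Gaussian data.

    Toy model (an assumption, not from the paper): prior theta ~ N(0, 1),
    likelihood x_i ~ N(theta, 1).
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(steps):
        # Use a random minibatch instead of the full data set.
        batch = rng.sample(data, batch_size)
        # Unbiased minibatch estimate of the gradient of the log posterior:
        # prior term plus the rescaled likelihood term.
        grad = -theta + (n / batch_size) * sum(x - theta for x in batch)
        theta = sgld_step(theta, grad, step_size, rng)
        samples.append(theta)
    return samples
```

    After a burn-in period, the average of the returned samples should lie close to the exact posterior mean of this toy model, sum(data) / (len(data) + 1).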

    VEBO: A Vertex- and Edge-balanced Ordering Heuristic to Load Balance Parallel Graph Processing

    Graph partitioning drives graph processing in distributed, disk-based and NUMA-aware systems. A commonly used partitioning goal is to balance the number of edges per partition in conjunction with minimizing the edge or vertex cut. While this type of partitioning is computationally expensive, we observe that such topology-driven partitioning nonetheless results in computational load imbalance. We propose Vertex- and Edge-Balanced Ordering (VEBO): balance the number of edges and the number of unique destinations of those edges. VEBO optimally balances edges and vertices for graphs with a power-law degree distribution. Experimental evaluation on three shared-memory graph processing systems (Ligra, Polymer and GraphGrind) shows that VEBO achieves excellent load balance and improves performance by 1.09x over Ligra, 1.41x over Polymer and 1.65x over GraphGrind, compared to their respective partitioning algorithms, averaged across 8 algorithms and 7 graphs. Comment: 13 page

    (Un)twisted: talking back to media representations of eating disorders

    In 2014-15, there were several news reports about a rise in the diagnoses and treatment of eating disorders (EDs), as attributed to the use of image-driven social media. Such coverage can be situated within a long history of concern in which those diagnosed with an ED are constructed as ‘especially vulnerable’ to the power of media images – a subjectivity which is pathologised and devalued precisely through its association with femininity. The most incisive objections to EDs being presented as a response to the ‘weight’ of media representation have come from Abigail Bray (2005) in her work on how anorexia is constructed as a reading as well as an eating disorder. Indeed, there is a whole history of empirical work in Feminist Media Studies and Girlhood Studies which has challenged the pernicious construction of female subjectivity as ‘excessively’ invested in, and ‘damaged’ by, the consumption of mass mediated forms. Yet the media consumption practices of those with experience of an ED have not been subject to similar feminist re-evaluation – an omission which this research seeks to address. In exploring the results of 17 semi-structured interviews with people who have experience of an ED discussing their encounters with media representations of EDs (material that is often co-opted into debates about the ‘toxic’ nature of media culture in this regard), this article seeks to intervene in how such imagined media consumption practices are often defined. In seeking to speak back to historically pathologising constructions, the article seeks to explore the qualitative responses in the context of more ‘every day’ understandings of media engagement, thus working against the gendered othering which has persistently occurred

    Chaos: Scale-out Graph Processing from Secondary Storage

    Chaos scales graph processing from secondary storage to multiple machines in a cluster. Earlier systems that process graphs from secondary storage are restricted to a single machine, and therefore limited by the bandwidth and capacity of the storage system on a single machine. Chaos is limited only by the aggregate bandwidth and capacity of all storage devices in the entire cluster. Chaos builds on the streaming partitions introduced by X-Stream in order to achieve sequential access to storage, but parallelizes the execution of streaming partitions. Chaos is novel in three ways. First, Chaos partitions for sequential storage access, rather than for locality and load balance, resulting in much lower pre-processing times. Second, Chaos distributes graph data uniformly randomly across the cluster and does not attempt to achieve locality, based on the observation that in a small cluster network bandwidth far outstrips storage bandwidth. Third, Chaos uses work stealing to allow multiple machines to work on a single partition, thereby achieving load balance at runtime. In terms of performance scaling, on 32 machines Chaos takes on average only 1.66 times longer to process a graph 32 times larger than on a single machine. In terms of capacity scaling, Chaos is capable of handling a graph with 1 trillion edges representing 16 TB of input data, a new milestone for graph processing capacity on a small commodity cluster.

    X-Stream: Edge-centric Graph Processing using Streaming Partitions

    X-Stream is a system for processing both in-memory and out-of-core graphs on a single shared-memory machine. While retaining the scatter-gather programming model with state stored in the vertices, X-Stream is novel in (i) using an edge-centric rather than a vertex-centric implementation of this model, and (ii) streaming completely unordered edge lists rather than performing random access. This design is motivated by the fact that sequential bandwidth for all storage media (main memory, SSD, and magnetic disk) is substantially larger than random access bandwidth. We demonstrate that a large number of graph algorithms can be expressed using the edge-centric scatter-gather model. The resulting implementations scale well in terms of number of cores, in terms of number of I/O devices, and across different storage media. X-Stream competes favorably with existing systems for graph processing. Besides sequential access, we identify as one of the main contributors to better performance the fact that X-Stream does not need to sort edge lists during pre-processing
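
    The edge-centric scatter-gather model described above can be illustrated with a small sketch. This is not X-Stream's implementation: it is a single-threaded, in-memory toy that labels connected components, and the function name `edge_centric_components` is hypothetical. It keeps state in the vertices and only ever walks the edge list and the update list sequentially, in whatever order they happen to be stored.

```python
def edge_centric_components(num_vertices, edges):
    """Label connected components by repeatedly streaming an unordered edge list.

    Returns a list where entry v is the smallest vertex id in v's component.
    """
    label = list(range(num_vertices))  # per-vertex state
    changed = True
    while changed:
        # Scatter phase: stream all edges in storage order and emit updates.
        updates = []
        for src, dst in edges:
            updates.append((dst, label[src]))
            updates.append((src, label[dst]))  # edges are treated as undirected
        # Gather phase: stream all updates and fold them into vertex state.
        changed = False
        for vertex, candidate in updates:
            if candidate < label[vertex]:
                label[vertex] = candidate
                changed = True
    return label
```

    For instance, `edge_centric_components(6, [(3, 4), (0, 1), (1, 2)])` returns `[0, 0, 0, 3, 3, 5]`. The edge list is never sorted or indexed, which mirrors the abstract's point about avoiding a sorting step during pre-processing.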

    A global climatology of the mesospheric sodium layer from GOMOS data during the 2002-2008 period

    This paper presents a climatology of the mesospheric sodium layer built from the processing of 7 years of GOMOS data. With respect to preliminary results already published for the year 2003, a more careful analysis was applied to the averaging of occultations inside the climatological bins (10° in latitude-1 month). Also, the slant path absorption lines of the Na doublet around 589 nm show evidence of partial saturation that was responsible for an underestimation of the Na concentration in our previous results. The sodium climatology has been validated with respect to the Fort Collins lidar measurements and, to a lesser extent, to the OSIRIS 2003–2004 data. Despite the important natural sodium variability, we have shown that the Na vertical column has a marked semi-annual oscillation at low latitudes that merges into an annual oscillation in the polar regions, a spatial distribution pattern that had not been reported so far. The sodium layer seems to be clearly influenced by the mesospheric global circulation, and the altitude of the layer shows clear signs of subsidence during polar winter. The climatology has been parameterized by time-latitude robust fits to allow for easy use. Taking into account the non-linearity of the transmittance due to partial saturation, an experimental approach is proposed to derive mesospheric temperatures from limb remote sounding measurements.

    Ground-Based Assessment of the Bias and Long-Term Stability of Fourteen Limb and Occultation Ozone Profile Data Records

    The ozone profile records of a large number of limb and occultation satellite instruments are widely used to address several key questions in ozone research. Further progress in some domains depends on a more detailed understanding of these data sets, especially of their long-term stability and their mutual consistency. To this end, we made a systematic assessment of fourteen limb and occultation sounders that, together, provide more than three decades of global ozone profile measurements. In particular, we considered the latest operational Level-2 records by SAGE II, SAGE III, HALOE, UARS MLS, Aura MLS, POAM II, POAM III, OSIRIS, SMR, GOMOS, MIPAS, SCIAMACHY, ACE-FTS and MAESTRO. Central to our work is a consistent and robust analysis of the comparisons against the ground-based ozonesonde and stratospheric ozone lidar networks. It allowed us to investigate, from the troposphere up to the stratopause, the following main aspects of satellite data quality: long-term stability, overall bias, and short-term variability, together with their dependence on geophysical parameters and profile representation. In addition, it permitted us to quantify the overall consistency between the ozone profilers. Generally, we found that between 20-40 kilometers the satellite ozone measurement biases are smaller than plus or minus 5 percent, the short-term variabilities are less than 5-12 percent and the drifts are at most plus or minus 5 percent per decade (or even plus or minus 3 percent per decade for a few records). The agreement with ground-based data degrades somewhat towards the stratopause and especially towards the tropopause where natural variability and low ozone abundances impede a more precise analysis. 
    In part of the stratosphere a few records deviate from the preceding general conclusions; we identified biases of 10 percent and more (POAM II and SCIAMACHY), markedly higher single-profile variability (SMR and SCIAMACHY), and significant long-term drifts (SCIAMACHY, OSIRIS, HALOE, and possibly GOMOS and SMR as well). Furthermore, we reflected on the repercussions of our findings for the construction, analysis and interpretation of merged data records. Most notably, the discrepancies between several recent ozone profile trend assessments can be mostly explained by instrumental drift. This clearly demonstrates the need for systematic comprehensive multi-instrument comparison analyses.